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ABSTRACT 

A method for the recovery of the real space hne-of-sight mass density field from 
Lyman absorption in QSO spectra is presented. The method makes use of a Lucy- 
type algorithm for the recovery of the Hi density. The matter density is inferred from 
the Hi density assuming that the absorption is due to a photoionized intergalactic 
medium which traces the mass distribution as suggested by recent numerical simula- 
tions. Redshift distortions are corrected iteratively from a simultaneous estimate of 
the peculiar velocity. The method is tested with mock spectra obtained from N-body 
simulations. The density field is recovered reasonably well up to densities where the ab- 
sorption features become strongly saturated. The method is an excellent tool to study 
the density probability distribution and clustering properties of the mass density in 
the (mildly) non-linear regime. Combined with redshift surveys along QSO sightlines 
the method will make it possible to relate the clustering of high-redshift galaxies to 
the clustering of the underlying mass density. We further show that accurate estimates 
for (r^bar/i'^)^ J^^ H{z)^^ and higher order moments of the density probability function 
can be obtained despite the missing high density tail of the density distribution if a 
parametric form for the probability distribution of the mass density is assumed. 

Key virords: cosmology: theory, observation , dark matter, large-scale structure of the 
Universe — intergalactic medium — quasars: absorption lines 



1 INTRODUCTION 

The Lyman forest in QSO absorption spectra is now gener- 
ally believed to be due to absorption by large-scale neutral 
hydrogen (Hi) density fluctuations of moderate amplitude in 
a warm photoionized intergalactic medium (IGM) . This rel- 
atively new paradigm for the Lya forest differs considerably 
from the conventional "cloud" picture which had been advo- 
cated for two decades and in which the Lyman forest is due 
to a superposition from discrete absorbers with small cross 
sections (see Rauch 1998 for a review). The new picture pic- 
ture had been investigated using analytical calculations (e.g. 
Bond, Szalay & Silk; Mc Gill 1990; Bi, Borner & Chu 1992) 
but was only generally accepted after the coherence length 
of the absorbing structure could be measured accurately and 
turned out to be of order several hundred kpc (Dinshaw et 
al. 1994, e.g. Bechthold et al. 1994). The picture was then 
further sustained by (hydrodynamical) cosmological simu- 
lations of gas in dark-matter dominated universes (Cen et 
al. 1994; Petitjean, Miicket & Kates 1995; Zhang, Anninos 
& Norman 1995; Hernquist et al. 1996; Miralda-Escude et 
al. 1996). Important results of these simulations are a tight 



correlation between the Hi and the dark matter distribution 
(on scales larger then the Jeans length of the IGM) and a 
simple temperature-density relation for the IGM which de- 
pends only on the reionization history of the universe (e.g. 
Hui & Gnedin 1997, Haehnelt & Steinmetz 1998). 

As a consequence of this new picture the Lyman forest 
can be used to probe the distribution and clustering of dark 
matter at high redshift. For example, Bi & Davidsen (1997) 
have developed a simple analytic model for the IGM to gen- 
erate artificial QSO absorption spectra for a variant of the 
cold dark matter (CDM) cosmogony. They were able to re- 
produce the characteristic properties of observed absorption 
spectra. Croft et al. (1998) showed that absorption spectra 
provide important information on the shape and the am- 
plitude of the power spectrum of mass fiuctuations. Croft 
et al. used the following procedure. They first obtained a 
linearized fiux distribution by applying the Gaussianization 
scheme proposed by Weinberg (1992). They then inferred 
the shape of the linear power spectrum of dark matter den- 
sity fluctuation from the power spectrum of the linearized 
fiux and used mock spectra from numerical simulations to 
determine the amplitude of the power spectrum. Gnedin & 
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Hui (1998) used mock spectra generated from simulations of 
coUisionless particles run with a particle-Mesh code modified 
to mimick pressure efTects of the gas to investigate the effect 
of amplitude and the power spectrum of dark-matter fluctu- 
ations on the column density distribution of Lya absorption 
systems. 

In this paper a complementary approach is taken. We 
propose to use an analytical model of the IGM for a direct 
inversion of the absorption features in QSO spectra. This 

approach has the advantage that no particular cosmological 
model has to be assumed. Furthermore, the actual real space 
density along the LOS and its probability distribution can be 
studied and a direct hnk to other observations is posssiblc. 

The paper is organized as follows. In Section 2 we do- 
scribe the assumed model for the Lyman absorption. Section 
3 presents the inversion algorithm for recovering the real 
space dark matter (DM) density along the l.o.s., concen- 
trating on the LyQ forest. In section 4 wo tost the inversion 
procedure with mock spectra generated from numerical sim- 
ulations of collisional dark matter and show how to estimate 
higher order moments of the density probability distribu- 
tion and (fibarft'^)^ H{z)~^ (the parameter combination 
of baryonic density Slbar^i^, ionizing flux J and Hubble con- 
stant H{z) which determines the mean flux level in QSO 
absorption spectra). In section 5 we discuss possible appli- 
cations and give our conclusions. 



2 LYMAN ABSORPTION BY A 

PHOTOIONIZED INTERGALACTIC 
MEDIUM 

Absorption spectra arc normally presented in the form of 
a normalized flux which (neglecting noise and instrumental 
broadening) can be related to the optical depth as 



F{w) = /obs(w)/^'cont(w) = e" 



-t{w) 



(1) 



where t is the optical depth, w is the redshift space coor- 
dinate, /obs(ty) is the observed flux and 7cont(w) is the flux 
emitted from the quasar which would be observed without 
intervening absorption and has to be estimated from the 
data as well. 

The optical depth in redshift space due to resonant Lya 

scattering is related to the neutral hydrogen density, n^^^, 
along the line-of-sight (l.o.s) in real space as (Gunn & Pe- 
terson 1965, Bahcall & Salpeter 1965) 



t{w) = ao 



H{z) 



■Vp{x),h{x)]<\x, (2) 



where ctq is the effective cross section for resonant line 
scattering, H{z) is the Hubble constant at redshift z, x 

is the real space coordinate (in kms~^), His the Voigt 
profile normalized such that J Ti. dx — 1, Vp{x) is the 
l.o.s. peculiar velocity, and b(x) is the Doppler parameter 
due to thermal/turbulent broadening. For moderate optical 
depths the Voigt profile is well approximated by a Gaus- 
sian, Ti. = cxp [—{w — X ~ Vp{x))^ /h^). Assuming that 
hydrogen is highly ionized and in photoionization equilib- 
rium (rijjj oc J~^) we have 
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where rij^j is the neutral hydrogen density at mean total 
gas density and 1.56 < a < 2 (Hui&: Gnedin 1997). The 
second relation assumes that the gas density traces the dark 
matter density. At the low densities considered here shock 
heating is not important ant the gas is at the photoionization 
equilibrium temperature. The Doppler parameter can then 
be related to the total gas density and temperature as 
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where T is the temperature at the mean density and < 
/3 < 0.31 (Hui & Gnedin 1997). 

Combining equation (2), (3) and (4) we get 



t{w) = A{z) 



f Pdm(x )^" 
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where fibar and Q,max are the baryonic and total matter 
density in terms of the critical density and P is the pho- 
toionization rate per hydrogen atom (P is related to the 
flux of ionizing radiation J,, as P = 47r J Av a^, J^/hv, 
where cr^ is the hydrogen absorption cross section). To ob- 
tain the second relation in equation (6) we have have taken 
arec = 4.7 X 10"^'^(r/10*K)"" '^cm"'^s"^ for the hydrogen 
recombination coefficient, ao = 4.5 x 10~'^*cm'^ as the ef- 
fective cross section for resonant Lya scattering and used 
the high redshift approximation for the Hubble constant, 
Hiz) = HoQT.A^ + zf^'- 



3 THE INVERSION ALGORITHM 
3.1 The bsisic iterative scheme 

The proposed scheme for recovering the l.o.s. density from 
the flux is motivated by Lucy's method (Lucy 1974). In or- 
der to demonstrate the method let us, for the time being, 
consider the case where r <C 1 and Vp = 0. In this case, we 
have 



1 - F{w) 



{x)G{w,x)dx. 



(7) 



This is a linear integral equation with a positive Kernel 
G{w, x) = Tl[w—x, 6(a;))]cro c/H{z) and can be solved for rim 
using Lucy's iterative method. This is done in the following 
way. Divide w into equal bins of size Aw (the data are actu- 
ally given in bins of w anyways) and set fi — 1 — Fi. Provide 
an initial guess for {nm)j, say (nni)" = 1 — Fj H{z)/{aoc). 
Denote the values of (nHi)j at the rth iteration by {nm)j 
and evaluate the sum 



/r = ^(nm) 



where 



Gij Aw, 



(8) 
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The (r + 1)— th estimate of {nm)i is then 



('^HI)i = 



(9) 



(10) 



where G is some Kernel which in principle may be chosen to 
be different from G. Lucy's method, however, uses G = G. 
The averaging over 2m + 1 adjacent bins mitigates the effect 
of noise in the data. We work with m — 3. 

Now we generalise the iteration scheme to the case 
where r is not necessarily small. In this case, the relation 
between F and r is nonlinear. However, since the relation is 
monotonic, we can still use a modified version of equation 



/r = 1 - exp[- ^(7iHi)j cr, Ai 



(11) 



How do we decide when to stop the iterations? Suppose 
that in the j— th bin, the l-cr error in the measurement of 
Fj is cTj. Let us define the quantity 



E 



(12) 



We choose to stop at the r— th iteration when the value of 
drops below the number of data bins. It implies that the 
observed fiux is consistent with being a noisy realisation of 
the reconstructed fiux. 



3.2 Recovery of the matter density in redshift 
space 

We write (^ in terms of the redshift coordinate s = x + 
Vp{x), instead of x. If the coordinates x and s are related by 
a one-to-one mapping, then the velocity and the real space 
coordinate, can be expressed uniquely as functions of the 
redshift coordinate s. Working with s, equation ^ takes 
the following form 



r{w) 



H{z) 



-Vp{s)] 



1 



d«p(s) 



ds 



H[w — s, b{s — Vp{s))] ds 



(13) 



For brevity, we have maintained the same notation for the 
peculiar velocity as a function of s. According to the conti- 
nuity equation, the density in redshift space, n^^, is related 
to the density in real space by 



= n^As - Vp{s)] 



Therefore AlSh becomes 



dup(s) 



ds 



t{w) — / K{'w, s) n^^(s) ds 



(14) 



(15) 



where K{w,s) = Tl[w — s,b{s — Vp{s))]aoc/H{z) and the 
iterative scheme described in the last section can be applied 
to obtain the redshift space density along the l.o.s. The pa- 
rameter b depends on (s — Vp) implicitly via n^j. Because b 



is a weak function of n^j, we assume that b is constant in 
the reconstruction algorithm. This will greatly simplify the 
application of the algorithm. Tests of the algorithm, how- 
ever, are done using mock spectra generated with varying b 
according to equation (^. 

To relate the Hi density to the matter density we use 
equation (^ and (^. We still need to know the value of A 
to obtain Pdm / Pom- A has been estimated by adjusting the 
mean flux of mock spectra generated from hydrodynamical 
simulations to match the mean flux of observed QSO absorp- 
tion spectra (Ranch et al. 1997). The accuracy and possible 
biases of this determination have not yet been investigated 
in detail (see also Weinberg et al. 1997), but we consider the 
value of A (oc (f^bar/i^)^ H{z)~^) to be presently known 
with about 30 percent accuracy. For most of the paper we 
therefore assume that we know the value of ^ . In section 4.3 
we show how the value of A (oc (fJbar'i^)'^ H{z)~^) can 
be estimated directly from the distribution of the recovered 
quantity A^^ {pdm / pom) by assuming a parametric form 
for the probability distribution of the matter density. 



3.3 From redshift space to real space 

The correction for redshift distortions requires knowledge 
of the velocity field. The 3-dimensionaI velocity and mass 
density fluctuations are tightly related. However, non-linear 
velocity-density relations which are easy to implement gen- 
erally relate real space quantities, as does the relation (^ 
between the Hi and matter density. We have therefore to re- 
sort to an iterative procedure to derive the real space mat- 
ter and Hi densities from the estimated redshift space Hi 
density. Two issues need to be stressed here: (i) the veloc- 
ity field is not uniquely specified by the l.o.s. density field 
but influenced by the unknown 3D matter distribution, and 
(ii) redshift distortions can and do result in multi-valued 
zones where regions which do not overlap in real space are 
mapped onto the same redshift coordinate. In appendix A 
we describe a method which resembles Wiener flltering and 
addresses the flrst point. It allows us to obtain the most 
probable velocity field which is consistent with the estimated 
l.o.s. DM density field and has the statistical properties of a 
gravitationally clustering Gaussian random field of a given 
power spectrum. In principle the power spectrum can also 
be determined from the recovered density field but for sim- 
plicity we have chosen to adopt a power spectrum a priori. 
The results do not change much for reasonable choices of the 
assumed power spectrum. Multi-valued zones mainly occur 
in regions of high density where the inversion is difficult due 
to saturation effects in the spectrum. We did not try to cor- 
rect for this effect, but discuss some of the biases introduced 
by peculiar velocities in section 4. 

The iterative scheme to correct for redshift distortions 
which we have adopted can be summarized as follows: 

(i) Assume that the redshift-space and real-space Hi den- 
sity are equal and use (^) to compute the matter density 
field along the l.o.s. 

(ii) Estimate the peculiar velocity along the l.o.s. using 
the method described in the appendix. 

(iii) Use the estimated peculiar velocity field to correct 
for redshift distortions in the density field. 
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Figure 1. Normalized flux (top), density (middle) and velocity 
(bottom) along one line-of-sight through the simulation box at 
2 = 3. The dotted curve in the top panel shows the flux with 
peculiar velocities set to zero and the dotted curve in the middle 
panel shows the corresponding recovered density field. 

(iv) Use the corrected density field to obtain a better es- 
timate for tlie peculiar velocity field. 

(v) Repeat steps (ii)-(iv) until convergence is achieved. 

The scheme proved to be efficient; it typically converges 
after a few iterations. Iterative schemes of this kind have 
been used for correcting redshift distortions in galaxy red- 
shift survey (e.g. Yahil et. al. 1991). 



4 TESTS WITH NUMERICAL SIMULATIONS 

Ideally one would like to test the method with a large high- 
resolution hydrodynamical simulation of gas and dark mat- 
ter. However, simulations of collisionless particles are still 
vastly superior in dynamical range and especially speed. We 
have therefore chosen to test our inversion algorithm with a 
high resolution N-body simulation of pure collisionless dark 
matter particles where we used the relations in section 2.1 to 
obtain the Hi distribution from the smoothed dark matter 
density field. As demonstrated e.g. by Gnedin (1998) such a 
procedure results in a realistic Hi distribution (apart from 
the high-density regions which are not of interest here). 

The simulation used was run on the Cray T3E parallel 
supercomputer in Garching with a modified version (Mac- 
Farland et. al. 1997) of Couchman's P'^M code (Couchman 
et. al. 1995). The initial conditions were generated from the 
power spectrum for a standard cold dark matter (CDM) 
universe with fl — 1 and .ffo = 50km/s cubic box of co- 
moving length of 60h^^Mpc. The simulation was normalised 
such that the linear rms density fluctuations in a sphere of 
800km was 0.5 at redshift z = 0. The softening parame- 
ter was 13.2% of the mean particle separation and the mesh 
size was A'^ — 512 in one dimension. 




W [km/s] 

Figure 2. Top panel: Normalized flux (solid curve) in a line-of- 
sight through the simulation at 2 = and the flux correspond- 
ing to the density and peculiar velocity filed recovered with a 
Lucy- type inversion algorithm (dotted curve). The noise added 
to the input spectrum is shown separately at the bottom of the 
panel. Second panel: True real space density in the line-of-sight 
(solid curve) and recovered redshift-space matter density (dotted 
curve). Third panel: True real-space density (solid curve, same 
as in second panel) and real-space matter density recovered with 
applying redshift corrections (dotted curve). Bottom:panel The 
true l.o.s. peculiar velocity field (solid curve). The peculiar veloc- 
ity estimated from the recovered density in redshift space (dotted 
curve). For comparison, also shown is the velocity recovered from 
the true real space density (dashed line). 



4.1 Inverting mock spectra 

We have generated mock spectra from velocity and density 
fields along lines of sight randomly drawn from the simula- 
tion box. We used the output of the simulation at two red- 
shifts, z = 3 and 2 = 0, to investigate the effect of varying 
the fluctuation amplitude of the density fleld. We emphasize 
here that the mock spectra generated from the simulation at 
z = G are not meant to resemble absorption spectra at z = 0. 
The density and velocity fields in the simulations show sig- 
nificant variations in structure and amplitude at the two 
redshifts and the z = spectra are investigated with the 
same mean flux as the z = ?> spectra in order to test the 
inversion algorithm under varying conditions. 

Reliable velocity and density fields cannot be derived 
from the simulation on scales much smaller than the mean 
particle separation. We therefore applied Gaussian smooth- 
ing of widths 0.3h~^Mpc and 0.6h~^Mpc, respectively, to 
the outputs of the simulation at z = 3 and z = Q. Both 
smoothing scales correspond to the same physical scale 
(60kms-^). The rms values of the smoothed density fluctu- 
ations were 0.89 and 3.52, respectively. The density of neu- 
tral hydrogen was assumed to follow the relation (^ with 
Q = 1.7. The absorption optical depth was computed ac- 
cording to equation (^), (^, and (^ with 60 = 30km s~^. 
For both redshifts the value of A was adjusted so that the 
mean normalized flux of a large ensemble of mock spectra 
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Figure 3. The cumulative probability distribution of the true 
density field (solid curve) and that of the density field recovered 
(dotted curve) from mock spectra generated from the simulation 
at 2 = (left) and z = 3 (right). The triangles are the corre- 
sponding parametric form derived from an Edgoworth expansion 
of the PDF (|l7|) as explained in the text. 



was 0.69 (about the observed mean flux at z — 3 as deter- 
mined by Ranch et al. 1997). The spectra were convolved 
with a Ganssian of 8km/s FWHM to mimick instrumental 
broadening and photon/read-out noise was added with a to- 
tal S/N=50 per 3km/s pixel. 

The solid curve in the top panel of Fig.l is a mock spec- 
trum generated from the simulation output at z — 3. The 
dotted curve is the same spectrum with the peculiar veloci- 
ties set to zero. The peculiar motions do not only introduce 
a systematic shift but also a narrowing of the absorption 
features (see also Weinberg et al. 1998 for a discussion of 
this point). Furthermore, unsaturated lines get deeper due 
to the enhanced clustering in redshift space. The mean flux 
is also affected. For constant A it increased from 0.66 to 
0.69 including peculiar motions while the rms flux fluctua- 
tions increased from 0.31 to 0.35. 

Figure 2 shows some typical results of our inversion pro- 
cedure. In the top panel a mock spectrum is plotted (solid 
line) together with the spectrum corresponding to a recov- 
ered density and peculiar velocity field (dotted line). The 
smoothness of the dotted curve demonstrates that our in- 
version procedure succeeds in mitigating the noise in the 
data (which for clarity is shown separately at the bottom of 
the panel). 

Let us now have a look at the DM density field recovered 
from a mock spectrum generated with no peculiar velocities 
as shown in the middle panel of Figure 1. In this case the 
correspondence is excellent up to densities where the absorp- 
tion features become heavily saturated (for Lya absorption 
at a overdensity of about 3 at 2; = 3. Without peculiar ve- 
locities our inversion procedure works as well as we could 
reasonably expect. The obvious way to extend the inversion 
procedure to regions of higher density would be to use the 
higher order Lyman series lines which have smaller effec- 
tive absorption cross sections and therefore saturate at in- 
creasingly higher densities (Cowie, private communication). 
However, as we discuss below peculiar velocities significantly 
affect the quality of the recovered density field, especially in 
high density regions. The incorporation of higher order Ly- 
man series lines in the inversion procedure will therefore be 
difficult and we leave this to future work. 



The second panel of Fig. 2 shows the DM density field 
recovered from a mock spectrum generated with peculiar 
velocities but before correcting for redshift distortions. As 
expected the correspondence between true and recovered 
density is degraded compared to the ceise where the spec- 
trum was generated with the peculiar velocities set to zero. 
The main features of the density field are still recovered but 
there is a significant shift between true and recovered den- 
sity in real space and there of course remains the systematic 
underestimate of the density in saturated regions. 

The third panel shows the reconstructed density for our 
full iterative procedure including the correction of redshift 
distortions as described in section 2.4. The shift between 
true and recovered density in real space is reduced and the 
overall correspondence has significantly improved. This in- 
dicates that we succeed in removing a major fraction of 
the redshift space distortions. The recovered and true l.o.s. 
velocity fields are shown in the bottom panel. In all cases 
shown we have used the correct value for A, i.e. the value 
used to generate the mock spectrum. It should have become 
clear in this section that peculiar velocities significantly in- 
fluence the quality of the recovery. 

4.2 The density probability distribution 

For a statistical analysis of the recovered density field we 
will use the probability distribution function in differential 
(PDF) and cumulative form (CPDF). Let us for the time 
being still assume that the true value of A is known. We 
will discuss how A can be estimated from the PDF of the 
recovered density field in the next section. 

Fig. 3 shows the CPDF for the true and recovered DM 
density field for the simulations at z = 3 and z = 0. At low 
densities both curves correspond very well but the devia- 
tions become large at high densities due to saturation effects 
which confirms the visual impression from figure (|l|) and . 
We further quantify the differences between the CPDFs of 
true and recovered density field in terms of a number of 
moments of the density distribution in table ^. The bias in 
the recovered density introduces large discrepancies between 
the estimated and true values of the moments. For example, 
the mean and rms values estimated at z = are: 0.614 and 
0.608; instead of the true values: 1 and 3.52. 

What we would like to have is a simple parametric form 
for the PDF of the DM density. This could then be used 
to correct for the biases in the recovered density field. It 
has been suggested (e.g. Kofman et. al. 1994, Coles & Jone 
1991.) that the PDF of the density field of a gravitationally 
clustering Gaussian random field in the mildly non-linear 
regime is well described by a log-normal PDF. In that case 
the quantity 

= [ln(l + S)- Mi]/At2 (16) 

has a normal (Gaussian) PDF, where fxi and fi2 are the av- 
erage and rms values of ln(l + S). In figure (^) we compare 
the PDF of the DM density in the simulations (filtered with 
Gaussian windows of widths Rs =1.2Mpc and Rs =5Mpc 
in comoving units for z = and z — 3) with a log-normal 
distribution. For large smoothing scale the density field is 
indeed adequately described by a log-normal distribution. 
However, for decreasing smoothing scale, the true PDF be- 
comes more and more skewed. The skewness of v which we 
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Figure 4. The probability distribution of DM density in the sim- 
ulation at 2 = (top panels) and 2 = 3 (bottom panels) in terms 
of u = [ln{l + S) — fj,i] / fj,2 (the density field was smoothed with a 
Gaussian windows of width 1.2Mpc and 5Mpc in comoving units, 
respectively). The moments /ii =< x >, ^2 =< (2; — A'l)'^ >i/2 
and /i3 =< (x — m)^ > where x = ln{l + S), are indicated 

on the plot. The dotted curves are the best-fitting log-normal 
distribution, while the triangles show the best-fitting Edgeworth 
expansion of the PDF as in equation ([l7|). 



Table 1. Moments of the density field. The symbols (.) and |.| 
denote average and absolute values respectively. Also x = S/a, 
where S = p/ <p>— 1. The top figures are for the true density 
filed, the middle figures are for the recovered density field and 
the bottom figures are for the best-fitting Edgeworth expansion 
as described in the text. The moments are computed assuming 
the true value of A. 
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define as ^3 =< / ^2 > is one way of quantifying the de- 
viation from a log-normal distribution and is also indicated 
on the plot. 

To obtain an improved fit to the PDFs of our simulation 
we use the first term of an Edgeworth expansion (Colombi 
1994, Juszkiewicz et. al. 1995), 



1 



1 / 3 

-^fli^l2[U 



3v) 



where G{h') is a Gaussian with zero mean and unit variance. 
By imposing the condition < S >= we find the following 
relation between the three parameters, 



^J■l 



3! / 



0. 



(18) 



The functional form (|lj) has therefore two free parameter 
(one more than a log- normal PDF). In the remainder of 
this section we arbitrarily choose to work with fi2 and /ia. 
We also tried to reduce ([17|) to a one-parameter family. We 
failed, however, at establishing a tight relation between the 
two parameters for the PDFs of our simulations, which holds 
at all redshifts (see also Colombi 1994). 

In practice we find it again more convenient to work 
with the CPDF rather than the PDF itself, 



Q(p) 



— — erfc 



(19) 



We fit the functional form ( |lS| ) for the CPDF with fj,i ex- 
pressed in terms of jj.2 and fj,^ ( p^ ) to Qo{p) (the CPDF 
computed directly from the recovered density) by minimiz- 
ing the quantity 



R = 



dp [Q{fi2,P3;p) - Qo{p)f 



(20) 



(17) 



with respect to ^2 and /13. The cutoff pmax is introduced 
in order to give no weight to high-density regions where 
the discrepancy between the recovered and true densities is 
severe. Fig. |^ and Fig. ^ show these fits as triangles. The 
quality of the fit is generally satisfying. At the bottom of 
table ^ we list the corresponding moments. The mean is 
correctly recovered by construction but the other moments 
are also in significantly better agreement with the moments 
of the true density field than those calculated directly from 
the recovered density field. The rms value at z = is now 
e.g. 2.97 as compared to 0.62 from direct calculation and 
3.52 for the true density field. When we fitted a log-normal 
distribution to the CPDF we found similar values for the 
moments as for the Edgeworth expansion. 



4.3 Determining A (oc Ql^JJ) 

So far we have assumed that we know the value of ^ . As 
is clear from equation (^ and (^ assuming a wrong value 
of A leads to an estimate of pdm/pdm which is wrong by 
a constant factor (.4/.4truc)^'^" independent of the density. 
It is therefore possible to directly estimate A by fitting one 
of the parametric forms discussed in the last section to the 
CPDF of the recovered density with A being left as an addi- 
tional free parameter. It turned out that the Edgworth ex- 
pansion is less suitable for this procedure as is already has 
two free parameters (it becomes then sometimes difficult to 
get a unique fit). Fig. 5 shows the residuals of such a fit for 
the log-normal distribution as a function of A at the two 
redshifts. The residuals change significantly with varying A 
and show pronounced minima but the values of A at the 
minima are biased low by about 25 percent and 15 percent, 
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Table 2. Sensitivity of the estimated A and the moments to 
uncertainties in a.The bottom line shows the effect of varying 
the mean flux from 0.69 (as in table 1) to 0.75 (a = 1.7 was 
assumed). Results are for z = 0. 
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Figure 5. Residuals of the fit of a log-normal distribution to the 
distribution function of the recovered density as a function of the 
ratio of assumed to true value of .4 . Solid and dotted lines refer 
to mock spectra generated with and without peculiar velocities, 
respectively. 

respectively. The error of A determined in this way is prob- 
ably of the same order as when A is estimated by adjusting 
the mean flux level in mock spectra generated from numer- 
ical simulations. We also show the case for a recovery from 
mock spectra with the peculiar velocities set to zero (dotted 
curves). The minima shift significantly. This demonstrates 
that the bias in A is not only due to the systematic under- 
estimate of the density in high-density regions and the in- 
accuracy of the log-normal distribution but also due to the 
effect of peculiar velocities. Peculiar velocities introduce a 
dependence of the bias on the amount of small-scale power 
in the density fluctuation spectrum as apparent from the 
comparison between the cases at 2 = 3 and z = which 
difl^er by a factor four in the rms density fluctuations. More 
numerical work is needed to quantify the effect of changing 
the cosmological model/power spectrum and the resolution, 
but it seems nevertheless feasible to correct for the bias in 
A to about 10 percent accuracy. 

The assumed value of A will obviously also affect the 
determination of the moments described in the last section. 



1 — "f^^^T^r^!;;^ — | 1 1 1 r 

0.2 - 




OM - ^ Z = 




-0.2 - \, Z=3 

I I I I I I I I I I I I 

0.5 1 1.5 

A/A, 

Figure 6. Moments of the density probability distribution de- 
termined by fitting an Edgeworth expansion as a function of the 
assumed value of A ■ All values are divided by their true values. 
Solid, long-dashed, short-dashed and dotted lines correspond to 

(T, 5*3, Ssr and 5*4^ as defined in table 1. The arrows indicate 
the position of the minima in Fig. 5 for the case with peculiar 
velocities. 

Figure 6 shows the dependence of the estimated moments 
on the assumed value of A. For a reasonable range of A 
the biases range from 20 to 50 percent. If A is determined 
by fitting a log-normal CPDF and the moments are then 
estimated by fitting am Edgeworth expansion to the CPDF 
with this value of A the biases in the moments are generally 
smaller than 25 percent. 

We have also run our inversion procedure with equation 
of states different from the one used to generate the mock 
spectra and changed the mean flux in order to test how sen- 
sitive the biases are to these parameters. The corresponding 
estimates of A and the moments are listed in table 2. At 
least for the parameter space explored the biases seem to be 
robust. 



5 DISCUSSION AND CONCLUSIONS 

We have used numerical simulations to demonstrate that the 
1.0. s. matter density field can be recovered from QSO ab- 
sorption spectra using an analytical model for the IGM and 
a Lucy-type iterative inversion algorithm. We have thereby 
estimated the l.o.s. peculiar velocity field from the recovered 
density field and have corrected for redshift distortions in an 
iterative procedure. The inversion works well for the most 
underdense regions up to the density where the absorption 
features saturate. For Lya this occurs at an overdensity of 
a few but this limit can be pushed to higher densities by 
incorporating higher order lines of the Lyman scries. , How- 
ever, an inversion will become increasingly more difficult at 
higher densities due to shell crossing and shock heating of 
the gas. 
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We have then used the fact that the density probabiUty 
distribution seems to have a universal shape to get reUable 
estimates of higher order moments of the PDF despite the 
missing high-density tail. We found that the Edgeworth ex- 
pansion is an excellent approximation to the PDF of the log- 
arithm of the density, consistent with the results of Colombi 
(1994). By fitting the first two terms of such an Edgeworth 
expansion we estimated a number of moments of the PDF 
with an accuracy of about 25 percent. Estimates of simi- 
lar accuracy were obtained for the parameter combination 
(Obar/i^)^ H{z)~^hy fitting to a log-normal distribution. 
We expect that, eventually, these biases can be corrected to 
10 percent accuracy. 

In principle it should also be possible to estimate the 
correlation function and the power spectrum directly from 
the recovered density field. These will, however, be affected 
by peculiar motions, the inability to recover high density 
regions and the bias in the determination of A . We have 
not yet investigated in detail how big the resulting errors 
are. It might well be that a comparison of observed flux 
power spectra with those of mock spectra as suggested by 
Croft et al. is the most favourable way to deal with these 
problems. 

Observational information on the PDF of the matter 
density comes so far mainly from the analysis of galaxy sur- 
veys and is restricted to small redshifts. These constraints 
could be checked and extended to a much wider redshift 
range by applying our inversion technique to a moderate 
number of QSO absorption spectra of the kind which are 
now routinely taken with 10m class telescopes. This should 
also check and tighten existing constraints on the evolution 
of the UV background. By combining our inversion tech- 
nique with a redshift survey along QSO sightlines the clus- 
tering of galaxies and matter can be related without re- 
ferring to a particular cosmological model. Such a deter- 
mination of the bias between galaxy and matter cluster- 
ing is especially worthwhile. Most currently favoured mod- 
els agree with the recently determined clustering strength 
of high-redshift galaxies but differ strongly in their predic- 
tions for the yet unknown bias relation (Adelberger et al. 
1998). Three-dimensional information on the density and pe- 
culiar velocity fleld can be gained by applying the method to 
two or more spatially close line-of-sights. An intermediate- 
resolution spectral survey of quasars down to about 22nd 
magnitude in a field of a couple of square degree size has 
e.g. been suggested as a possible project for the VLT (Pe- 
titjean 1996). Such a survey will result in about 100 l.o.s. in 
a region about 30Mpc across. With the method proposed in 
the appendix such a region could become a unique labora- 
tory for the study of how galaxy formation is related to the 
distribution and dynamics of the underlying matter field. 
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APPENDIX A: PECULIAR VELOCITIES AND 
3-D DENSITIES FROM LINE-OF-SIGHT 
DENSITY FIELDS 

Al From l.o.s. density to Lo.s. peculiar velocity 

In this section we describe the method which we used to 
estimate the l.o.s. peculiar velocity component from the (es- 
timated) real space density along the l.o.s. We first have to 
assume a non-linear relation between the 3-dimensional ve- 
locity and density fields. Various such relations have been 
suggested, here we work with the relation found empirically 
by Nusser et. al. (1991). Let 9 = — divv, where v is the 
physical peculiar velocity and the divergence is with respect 
to the physical coordinate r in units of km/s. The approxi- 
mation suggested by Nusser et al. (1991) is given by 

^ = t:t^ + =""^*- 

The relation is local and provides 9 in terms of the density 
contrast along the l.o.s. The constant factor is introduced 
to ensure < 9 >= 0. Since 9 involves derivatives of velocity 
components perpendicular to the l.o.s. , the l.o.s. peculiar 
velocity component can not uniquely be determined from 
9. Therefore, the best we can do is provide an estimate for 
the l.o.s. peculiar velocity based on some statistical assump- 
tions. Neither of the fields 6' or 5 is Gaussian. However, for an 
approximate treatment we assume here that 9 is Gaussian. 
Note that by (Al) this assumption does not imply Gaus- 
sian density fiuctuations. At the end of the next subsection 
we will briefiy describe how estimates of the peculiar ve- 
locity can be obtained without assuming that the field 9 is 
Gaussian. Let v be the component of v along the l.o.s. Sub- 
sequently we work with the (1-dimensional) Fourier tran- 
forms, 9{q) and v{q) of the l.o.s. fields v and 9. According 
to Bi (1993) 9 and v are related by 



9{q)^u{q)+a{q) 
v{q) = iqa{q)w{q), 



(A2) 
(A3) 



where w{q) and u{q) are two uncorrelated Gaussian fields 
with power spectra P^, and P„. These power spectra can be 
written, in terms of the 3D matter power spectrum P, 



/>oc 

P„(g) = 2Tva-^ / P(fe)fe"Mfc, 

J q 

and 

/•oo 

P„(g) = 27r / P{k)kdk-P^{q), 

J q 

with 

r°°p(fc)fc-3dfc 

a(9) = ^'^^ 



/°°P(fc)fc-idfe' 



(A4) 



(A5) 



(A6) 



The relations (A2) and (A3) can be used to estimate 
v{q) from 9{q). For simplicity, we work with w instead of v 
and switch back to v at the end of the calculation. Accord- 
ing to Bayes' theorem the conditional probability Pr[w|6] is 
given by 



Pr{w\9) = ^^p(e|i«) 
P(6») 

Using the relations ([A2|) and (|A3|) we find 



(A7) 



Pi{9\w) oc exp 
and 

Pr{w) oc exp 

Therefore 

Pi {w\9) oc exp 



{9-wf 



2Pu 



w 



{9~wf 



2P. 



w 

2pI 



This is a Gaussian with mean 
p X -1 



< w 9 > = 



1 + 



and variance (power spectrum) 
P 2 

J- w 



p , - — 

w\0 p I p 



Thus, given 9{q), one may write 

Pu- 



w{q) 



1 + 



-n{q), 



(A8) 



(A9) 



(AlO) 



(All) 



(A12) 



(A13) 



where n{q) is a random Gaussian variable with power s pec- 
trum P^|g which is uncorrelated with 9{q). The form of (All 



is reminiscent of Wiener filtering of noisy data (Wiener 1949, 
see also Zaroubi et. al. (1995) and Fisher et. al. (1995) for 
applications to cosmology). To see this simply identify Pu 
and Pw as, respectively, the signal and noise power spectra. 
According to (A2), the power spectrum of the unconditional 
^ is Pe = P^ + Pu- This clear ly di fi^ers from the power spec- 
trum of < > given in (A12). Following Sigadet. al. 



(1998) here we replace the filter (1 -|-Pu/Pw)" in (|All|) by 
(1 -I- Pu/Pw)^^''^ in order to preserve the power spectrum 
of the unconditional field. Alternatively we could have com- 
plemented < > with the random field n{q) in order to 
preserve the power. 



A2 Reconstruction of 3D fields 

For simplicity we discuss the case of a single l.o.s. The gen- 
eralisation to multiple parallel lines of sight is obvious. We 
arbitrarily choose the l.o.s. to be along the x axis. Once the 
density S^°^ in the l.o.s. is given, we can easily compute the 
corresponding Fourier transform 5^°^{q) defined by 



5'°=(g) = 



(27r) 



1/2 



5 °°(a;) exp(iga;) da; 



(A14) 



We write the 3D density field S{r) in terms of its Fourier 
transform 



5(r) = 



(27r)3/2 



(5(k)exp(-ik- r) d^k. 



(A15) 



Define l{x) — 5{r — xx) whe re a: is a u nit ve ctor along 
the a;— axis. Hence, by combining ( A14) and (A15) we obtain 



(2^) 



S^°^{q) = / (5(fc|| , ki) exp(i(7a; — ifc||a;) dfc|| d^ki da; 



5(g,kx)d2kx, 



(A16) 



where fey and kx are the components parallel and perpen- 
dicular to the l.o.s. The problem of reconstructing the 3D 
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density field reduces to estim ating 5(k) given the coeffi- 
cients 5^°%q) and the relation (A16). Note that 5'°° can be 
computed from 5'™. The method of Hoffman and Riback 
(1991) (hereafter HR) can be used to obtain such an es- 
timate. First, we derive an expression for the mean value 
6^^(k) =< (5(k)|{5'°=(g)} > of 5{k) corresponding to a par- 
ticular k given the set of coefficients {(5'°°(g)}. According 
to HR, we have 



5^^(k) 



M{],,q')N-'{q,q')5'°^{q)dqdq', 



(A17) 



where M(k,q') —< 5(k)5^°^{q) > is the covariance matrix 
of S and S^°^, and N~^{q,q') is the inverse of the covari- 
ance matrix N{q,q') =< 5'°''(?)(5'°°(g') >; both matrices 
are of infinite dimension. Using the homogeneity cond ition , 
< (5^°=(k)5'°=(«A;' >= 5^(k-k')P(fc) and the relation ( |A16| ) 
it can be seen that N and Af are given by 



5°{q. 



)P'°=(a), 



Niq,q') 

and 

M(k, q) = 5°(g - fc||)P (V^^+ki) 
where 5° is the Dirac 5— function and 



P(fc)fcdfc 



(A18) 



(A19) 



(A20) 



is the ID power spectrum of the density field along the l.o.s. 
The mean is given by 



5'^^(k) 



^^^^ 5'°'(fe||). 



pi°=(fe|l) 



(A21) 



Once the mean values are given, a constrained random 
realisation, A*"^ can be generated from an unconstrained ran- 
dom realisation A using (HR), 



A^(k) = A(k) 



P(fc) 
Pi-(fc|l) 



[5'-(fc|l) 



(A22) 



where A'°=(fc||) are the ID Fourier coefficients of the uncon- 
strained density field in the l.o.s. 

The treatment so far relied on the HR method which 
assumes Gaussian fields. This treatment will not be satis- 
factory for a 3D reconstruction in the no n-line ar regime. 
The unconstrained density field as given by ( A22 ) is e.g. not 
guaranteed to have positive values everywhere. The scheme 
can be extended to the non-linear regime by two modifica- 
tions: (i) use ln(l -I- S) instead of 5, (ii) extract an uncon- 
strained random field A(r) or its Fourier transform A(k) 
from a fully non-linear N-body simulation. The simulations 
can be used to compute the ID and 3D power spectra of 
ln(l + S) which are required to generate the constrained re- 
alisation. But note that these power spectra can in principle 
be estimated from the density field along the l.o.s. 

A 3D constrained realisation obtained with this scheme 
adapted to the non-linear regime can then be used to esti- 
mate the peculiar velocity without assuming 9 to be Gaus- 
sian. 9 can be computed at any point in space from the 3D 
constrained realization of the density field using the velocity- 
density relation (Al) and assuming a potential flow. The 
peculiar velocity is then obtained by solving a Poisson-like 
equation to recover 9. The implementation of this alterna- 
tive scheme for recovering the peculiar velocity fleld and a 



comparisons to the scheme described in the last section is 
left to future work. 



